WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau 







PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6:
G05B 13/02

A1

(11) International Publication Number: WO 97/12300

(43) International Publication Date: 3 April 1997 (03.04.97)



(21) International Application Number: PCT/US96/15277

(22) International Filing Date: 25 September 1996 (25.09.96) 



(30) Priority Data:
60/004,328    26 September 1995 (26.09.95)    US



(71)(72) Applicant and Inventor: BOIQUAYE, William, J., N-O.
[GH/US]; Apartment No. 7, 33 Yorkshire Terrace,
Shrewsbury, MA 01545 (US).

(74) Agent: TOUW, Theodore, R.; 4 Forest Lane, Westford, VT
05494 (US).



(81) Designated States: CA, JP, US, European patent (AT, BE, CH,
DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT,
SE).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: ADAPTIVE CONTROL PROCESS AND SYSTEM 






(57) Abstract 

A system for adaptively controlling a wide variety of complex processes, despite changes in process parameters and despite both
sudden and systematic drifts in the process, uses response surfaces described by quadratic equations or polynomials of any order. The 
system has means of estimating the dynamic component of a drifting process or system and thereby identifies the trend of output response 
variables of the controlled process. Using this information, the system predicts future outputs based on a history of past and present inputs 
and outputs, thereby recommending the necessary control action or recipe (set of input parameters) to cancel out the drifting trend. A 
specific embodiment is a system for the adaptive control of photoresist thickness, uniformity, and dispense volume in the spin coating of 
wafers in integrated circuit manufacturing. Methods used in the adaptive control system are adaptable to control many processes not readily 
modeled by physical equations.

DESCRIPTION
ADAPTIVE CONTROL PROCESS AND SYSTEM 



TECHNICAL FIELD 

This invention relates generally to automatic control systems. More particularly, it relates 
to systems and methods for control of processes using non-linear approximations to generate 
models of response variables. 

BACKGROUND ART

In today's industry, it is desirable to have processes that are easily adaptable to different 
process specifications and targets while using the same production equipment. Moreover, more 
stringent quality specifications make it ever more difficult to use trial-and-error methods to attain 
these desired objectives. A control system enabling high process adaptability for different product 
specifications, even in the face of changing manufacturing conditions, is of special advantage.
Such a control system's control strategy should not only allow for changes in product
specifications, but should also be able to recover the process after a maintenance operation or
unknown disturbance. It should also be able to compensate for slow drifts and sudden shifts in a
process and should have the ability to incorporate both economic and quality specifications in its
objective criterion.

A common practice in industry today is to dedicate equipment to specific processes and
produce the same product using the same recipe. However, with the increasing use of cluster tools that
are programmable to do various tasks (especially in the semiconductor industry), the need for
flexibility and adaptability is widely felt. For instance, a change in a batch of chemical used in a
process or a change in ambient temperature could result in products that are out of
specification if the recipe is not changed appropriately.

At present, the process control methods practiced in industry are statistical process control
(SPC) and automatic feedback control. These two methods are used virtually independently of one
another, and each is unable to address all the control issues mentioned above. The concerns
discussed above have necessitated the development of this invention.

Adaptive control techniques are widely applied in the aerospace industry, for example in
autopilot design and other navigation equipment. Unfortunately, this has not
been the case in many other industries. Even though the growth of the semiconductor industry has
brought along with it many advances in science and technology, and especially advances in the
computer industry, there has been relatively little process automation in the semiconductor
fabrication industry. Thus, although one may see robots working in an assembly line, most of
the processes often run open loop. Such processes can produce a large number of
out-of-specification products, which must be manufactured and tested before
an SPC system can issue a warning.

At present, some of the most popular process control methods practiced in industry are
statistical process control (SPC), factorial design, and automatic feedback control. In SPC, a
history of product statistics is measured and plotted together with the limits for acceptable
products. When products consistently go outside this range, it signifies that the process has
changed and that action needs to be taken to rectify the process change. In factorial design,
combinations of experiments are conducted off-line to determine the interdependencies of the
variables and to generate linear regression models for further analytical work. Automatic feedback
control is a well-known field that seeks to control the outputs of a system to track a reference input
or to meet a similar objective. This field has been well formalized for linear systems, but the application
of feedback theory to nonlinear processes and systems is quite a challenge. It is not surprising,
therefore, that there have not been many advances in the area of process automation whereby
processes are run in a closed loop without operator intervention. Since most manufacturing lines
use machines, raw materials, and process chemicals, the modeling of the entire process is often
intractable. Not only are typical real manufacturing processes very complex, but they also typically
are nonlinear and not amenable to exact closed-form solutions. Hence, methods such as automatic
feedback control, which rely on analytical mathematical models derived from differential equations
of the process, are limited. Even though factorial design can provide information about the variable
interdependencies, its operation in a closed loop without the involvement of a human operator
has not been attained heretofore.

Methods using ULTRAMAX software for control optimization have been described by
C. W. Moreno in the article "Self-Learning Optimization Control Software" (Instrument Society of
America Proceedings, Research Triangle Park, North Carolina, June 1986) and by
C. W. Moreno and S. P. Yunker in the article "ULTRAMAX: Continuous Process Improvement
Through Sequential Optimization" (Electric Power Research Institute, Palo Alto, CA, 1992).
Other related publications are the article by C. W. Moreno, "A Performance Approach to Attribute
Sampling and Multiple Action Decisions" (AIIE Transactions, Sept. 1979, pp. 183-197) and C. W.
Moreno, "Statistical Progress Optimization" (P-Q System Annual Conference, Dayton, Ohio, Aug.
19-21, 1987, pp. 1-14). E. Sachs, A. Hu, and A. Ingolfsson, in an article entitled "Run by Run
Process Control: Combining SPC and Feedback Control" (IEEE Transactions on Semiconductor
Manufacturing, October 1991), discussed an application combining feedback and statistical process
control, which used parallel design of experiments (PDOE) techniques in combination with linear
run-by-run controllers.

U.S. Pat. No. 3,638,089 to Gabor discloses a speed control system for a magnetic disk
drive having high- and low-level speed means. A feedback control loop compares index marks
from a disk unit in conjunction with a counter unit driven by an oscillator to provide a reference
level to drive a DC drive motor between a high-level speed above its normal speed, and a low-level
speed below its normal speed. An open-loop system also provides high-level and normal speeds.
The open-loop system includes a voltage-controlled oscillator (VCO), an amplifier, and an AC
drive motor. U.S. Pat. No. 5,412,519 to Buettner et al. discloses a disk storage device which
optimizes disk drive spindle speed during low power mode. This system optimizes power savings
to the characteristics of the particular drive. A transition speed is recalibrated periodically, and
adaptive control can be implemented in this system by altering the time between recalibration
cycles, extending the time if little or no change has occurred, or shortening the time when a sample
sequence indicates changing status or conditions.

U.S. Pat. No. 5,067,096 to Olson et al. discloses a target engagement system for 
determining proximity to a target. This system uses target motion analysis to determine a target 
engagement decision for ground targets such as vehicles. The input to the engagement system is 
the target azimuth as a function of time. The target is estimated to be within range or out-of-range
based on calculation of a ratio of time intervals of crossing specified target azimuth sectors.

U.S. Pat. No. 5,144,595 to Graham et al. discloses an adaptive statistical filter for target 
motion analysis noise discrimination. The adaptive statistical filter includes a bank of Kalman 
filters, a sequential comparator module, and an optimum model order and parameter estimate 
module. 

U.S. Pat. No. 5,369,599 to Sadjadi et al. discloses a signal metric estimator for an
automatic target recognition (ATR) system. A performance model in the form of a quadratic
equation is partially differentiated with respect to a parameter of the ATR, and the partial 
differentiation allows solution for an estimated metric. 

U.S. Pat. No. 5,513,098 to Spall et al. discloses a method of developing a controller for
general (nonlinear) discrete-time systems, where the equations governing the system are unknown
and where a controller is estimated without building or assuming a model for the system. The
controller is constructed through the use of a function approximator (FA) such as a neural network
or polynomial. This involves the estimation of the unknown parameters within the FA through the
use of a stochastic approximation that is based on a simultaneous perturbation gradient
approximation.

Thus, a variety of methods for automatic control and especially for automatic target 
recognition, and systems using the methods have been developed for specific purposes, some of 
which do not depend on analytic mathematical models such as differential equations. Some of the 
methods used in the background art cannot deal with fast-drifting systems, and some rely on small 
perturbations of the input variables, so that a resulting goal function must lie within a limited range
around the desired trajectory.

DISCLOSURE OF THE INVENTION

This invention provides features that enhance existing methods and provides new methods, 
resulting in a novel system that is extremely flexible and versatile in its applications. The methods 
of this invention can be applied, not only to systems and processes which can be modeled by 
differential equations, but also to processes which are described by quadratic or higher-order
polynomial models. This invention is described in a Ph.D. dissertation entitled "Adaptive Control
of Photoresist Thickness, Uniformity, and Dispense Volume in the Spin Coating of Wafers,"
submitted by the present inventor on 27 September 1995 to the University of Vermont, the entire
disclosure of which dissertation is incorporated herein by reference. This dissertation is available
to the public at the Research Annex, Bailey Howe Library, University of Vermont, Burlington,
Vermont.
Nomenclature 

The term "recipe," as used in this specification, means a vector or ordered set of input
variables for a process to be controlled.

In most manufacturing operations, a tremendous amount of information on processes can
be recorded and stored as historical information in databases. The approach of this invention takes
advantage of such historical information. This is done by feeding historical data into a run-by-run
sequential design of experiment (RBR SDOE) optimization routine, continuing with the
optimization process, and finally identifying the optimum operating point. Then models (linear or
nonlinear) of the response variables, in terms of the input variables (recipe), can be generated at the
optimum operating point. This RBR SDOE approach allows for the definition of multiple
objective functions such as performance loss functions and hence allows the optimization (e.g.,
minimization) of a suitable performance measure while meeting constraints for input and response
variables. Once these local models are generated, the nonlinear adaptive controller is initialized
using the models. The approach used in this invention is rigorous, and it addresses the fundamental
issue of nonlinearity in the uniformity surface response. The effects of uncontrolled variables, of
variable interactions, and of second-order terms on the performance measure can be better
accounted for using quadratic models. In general, linear models are often a sufficient
approximation to the true behavior of the system far from the optimum, but they are not very good
for describing response surfaces in the region of the optimum. This is because the region of the
optimum usually shows curvature that cannot be explained by linear relationships. Curvature is
always accounted for by higher-order terms. Furthermore, when interaction is present in multi-factor
systems, linear models cannot adequately describe the "twisted plane" that results from the
interaction. The controller of this invention is able to account for all these factors since it is a
nonlinear controller.

Purposes, Objects, and Advantages

A major purpose of the invention is to provide an adaptive control system capable of 
automatically controlling a wide variety of complex processes despite changes in process 
parameters and despite drifts in the controlled process. A related purpose is to provide methods by
which such a system can be implemented.

Thus an important object of the invention is a system for controlling a multi-variable process
that cannot be readily modeled by physical equations. Another important object is a process
control system that can detect the incidence of manufacturing problems after only a few products
are manufactured. Another object is a process control system that can predict the possibility of
manufacturing off-specification products. A related object is a system that can sound an alarm
before off-specification products are manufactured, so as to avoid waste. These and other
purposes, objects, and advantages will become clear from a reading of this specification and the
accompanying drawings.

Understanding of the present invention will be facilitated by consideration of the following 
detailed description of a preferred embodiment of the present invention, taken in conjunction with
the accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a diagrammatic view of a control system made in accordance with the present invention.

FIG. 2 is a logic flow chart implementing a preferred embodiment of the present invention.

FIG. 3 is a graph showing historical data of the optimization phase of an example process.

FIGS. 4a, 4b, 5a and 5b show surface response plots of two input variables and two output
variables of an example process.

FIGS. 6a and 6b are graphs showing simulated output and real process output for an example
process.

FIGS. 7a and 7b are graphs showing real process data compared to model predictions for an
example process.

MODES FOR CARRYING OUT THE INVENTION

A general description of the adaptive control system in this invention is illustrated in FIG. 1,
where the control system strategy is applied to the photoresist coating of wafers in semiconductor
integrated circuit (IC) manufacturing. The object here is to use the adaptive control system to
obtain a photoresist film of specified mean thickness with the best possible uniformity by
appropriately choosing the input variables or recipes to achieve this in the face of changes in
process parameters. The process has about fourteen potentially important input variables and
three output variables. The output variables are the cross-wafer mean thickness, the cross-wafer
standard deviation or uniformity, and photoresist dispense volume. The dispense volume is
actually an input variable which is also required to be minimized. In FIG. 1, T_p and s_p denote the
predicted thickness and standard deviation respectively, while T_d and s_d respectively represent the
desired thickness and standard deviation.

As a first step in the application of this adaptive control system, a nominal empirical model
of the system is obtained. This can be done by optimizing a compound performance loss function
using sequential design of experiment techniques. This step can be replaced with parallel design of
experiment techniques or by using physical models of the process. The optimization process will
eventually identify the key input variables, and the variables that have little or no influence on the
response variables can be considered as constants. The resulting model in terms of the key
variables may be of the form

$$F(x) = C + \sum_{i=1}^{N} G(i)\,dX(i) + \sum_{i=1}^{N}\sum_{j=1}^{N} H(i,j)\,dX(i)\,dX(j)$$ Equation (1)

where,

N = Number of input variables
C = Constant term
R = Reference vector
dX = X - R (if R = 0, then dX = X); X is a vector of input variables (the recipe)
G is the gradient vector
H is the Jacobian matrix

The series of steps in implementing this control system strategy is illustrated using a 2-variable
(x_1 and x_2) process as an example. The quadratic model of a 2-variable process can be represented as

$$y = a_0 + a_1 x_1 + a_2 x_2 + a_{11} x_1^2 + a_{22} x_2^2 + a_{12} x_1 x_2$$ Equation (2)

which in vector form will be

$$y = \phi^T \theta_0$$ Equation (3)

where,

$$\phi^T = [\,1 \;\; x_1 \;\; x_2 \;\; x_1^2 \;\; x_2^2 \;\; x_1 x_2\,]$$

$$\theta_0^T = [\,a_0 \;\; a_1 \;\; a_2 \;\; a_{11} \;\; a_{22} \;\; a_{12}\,]$$
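As a rough illustration of Equations (2) and (3), the sketch below builds the regressor vector for a two-variable recipe and evaluates the quadratic model as an inner product. The function names and the coefficient values are hypothetical and serve only to show the arithmetic; the last regressor entry is assumed to be the interaction product of the two inputs.

```python
def regressor(x1, x2):
    """Regressor vector phi for the two-variable quadratic model.

    Assumed ordering: [1, x1, x2, x1^2, x2^2, x1*x2].
    """
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]


def predict(theta, x1, x2):
    """Model output y = phi^T theta for a given recipe (x1, x2)."""
    return sum(p * t for p, t in zip(regressor(x1, x2), theta))


# Hypothetical coefficients [a0, a1, a2, a11, a22, a12], for illustration only.
theta = [1.0, 2.0, -1.0, 0.5, 0.25, -0.5]
y = predict(theta, 2.0, 4.0)
```

Writing the model in this vector form is what allows a least squares estimator to treat the quadratic model as linear in its parameters.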




Unlike future outputs of autoregressive moving average models, which can be expressed by 
recursive relations in terms of past outputs and past and present inputs, the resulting models given
by Equations (1) and (2) are static, and hence cannot capture the dynamic behavior of a process. 
Hence, they lack the ability to be used in predicting the output of the process in a time-series sense. 
The calculation of the error term

$$e(t) = y(t) - \hat{y}(t)$$ Equation (4)

in the least squares estimation process requires knowledge of the predicted output

$$\hat{y}(t) = \phi(t-1)^T \theta(t-1)$$ Equation (5)

This information is not available in models obtained by experimental design techniques. Hence, a
method of predicting future outputs is needed. In this invention, polynomial
extrapolation is applied to available historical data to predict the mean thickness and standard
deviation for the next run. This then replaces what can be seen as the predictive part of
autoregressive moving average models. What this suggests is that parameter estimation and
prediction should be done somewhat independently, with prediction following estimation in the
same loop. Thus, two least squares estimators were used in implementing this control system
strategy. The first least squares estimator is used in modeling the process behavior, up to the
current run, by choosing parameters of the quadratic model such that the error between the actual
responses measured and the model responses is minimized in a least squares sense. Polynomial
extrapolation is then used, in conjunction with historical data, to obtain the thickness and standard
deviation of the next run. Then these extrapolated response values are used, as if they were the real
outputs, to update the parameters of the second estimator. Thus, the resulting parameters of the
second estimation process can be used to predict the responses of the drifting process if the recipe
is known. If we were to continue to use the current recipe, the process would continue to drift.
Given this current process drift as captured by the parameters of the second estimation process,
this invention provides a way to find the correct recipe to apply in order to cancel out the
drifting trend and simultaneously satisfy all the response targets.
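The extrapolation step might be sketched as follows, assuming the responses are indexed by run number and that a low-order polynomial through the most recent runs is used. The specification does not fix the polynomial order, so the `order` parameter, the function name, and the sample thickness values are illustrative assumptions rather than the inventor's implementation.

```python
def extrapolate_next(history, order=1):
    """Predict the next value of a run-indexed series.

    Fits the unique polynomial through the last (order + 1) points,
    placed at run indices 0, 1, ..., and evaluates it one run ahead
    using the Lagrange interpolation formula. If fewer points are
    available, a lower-degree polynomial through all of them is used.
    """
    if not history:
        raise ValueError("history must contain at least one value")
    pts = history[-(order + 1):]
    n = len(pts)
    t_next = n  # index of the run to be predicted
    prediction = 0.0
    for i, y_i in enumerate(pts):
        weight = 1.0
        for j in range(n):
            if j != i:
                weight *= (t_next - j) / (i - j)
        prediction += weight * y_i
    return prediction


# Hypothetical mean thickness readings (nm) for the first two runs.
thickness_history = [1000.0, 1002.0]
T_e3 = extrapolate_next(thickness_history, order=1)
```

With only two data points the polynomial is a straight line, so a steady drift of 2 nm per run is simply projected one run further.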

As an illustrative example, this invention was applied to the spin-coating of wafers in IC
manufacturing. The problem statement is that a system is desired that would model the process of
depositing a specified thickness of photoresist thin film (a 1000 nanometer (nm) film was studied
in this example case), with the best possible uniformity using the least quantity of photoresist
chemical. To apply this invention to the spin-coating process, the latter was exercised through a
number of run/advice cycles of an optimization routine. The results are summarized in
FIG. 3. The figure clearly shows that the photoresist dispense volume was dramatically reduced
from about 8 milliliters to about 4.3 milliliters for a 1000 nm film. This amounts to a reduction of
almost 50% in chemical usage and translates to millions of dollars of savings for a typical modern
semiconductor fabricating facility. It can be seen that the standard deviation reduced steadily
run-by-run. The cross-wafer mean thickness, however, remained fairly constant at 1000 nm. The
photoresist dispense volume is an input variable, but it is also required to be minimized and so it is
also defined as a calculated output variable. For physical constraint reasons the dispense volume
could not be reduced below 4.3 milliliters without affecting the quality of the films. So, the
optimum value of the volume was set at 4.3 milliliters and assumed as a constant thereafter.
Hence, hereafter only the cross-wafer mean thickness and standard deviation are considered as
output variables to be optimized.
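Taking the approximate dispense volumes quoted above at face value, the fractional reduction in photoresist usage works out as follows:

```latex
\[
\frac{8\ \text{mL} - 4.3\ \text{mL}}{8\ \text{mL}} = \frac{3.7}{8} \approx 0.46
\]
```

That is a saving of roughly 46%, a little under half of the original volume.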

Models are generated from the optimization process above, and information such as
the key variables and the percentage contribution of the input variables to the response variables can be
deduced from the models. 3D surface response plots can be generated from the models to give a
pictorial view of the variable dependencies. FIGS. 4 and 5 show 3D and contour plots of the
film mean thickness and standard deviation with two independent variables, dispense speed and
spread speed. Note that even though the optimization process drove the process to the
neighborhood of the desired targets as shown in FIG. 3, it still exhibited large run-by-run variability,
particularly in the mean thickness and standard deviation. Thus, to reduce the run-by-run
variability, the models generated from the optimization phase are used in initializing a novel
adaptive controller.

The chronological sequence of events involved in engaging the novel adaptive controller
from the start of the process is as follows. After the first wafer is spin-coated with photoresist, a
cross-sectional mean thickness measurement is obtained using a measurement tool. Thereafter the
real mean thickness T_r and sample standard deviation s_r are computed. Then, initial parameter
values of the first estimator are chosen using the model coefficients determined from the
optimization phase, and these parameters are updated with the new process data as follows. Given
the current recipe x and the current parameters a and b (x, a, and b are vectors), the model thickness
and standard deviation can be calculated from Equation (1) as T_m(a, x) and s_m(b, x) respectively.
A new set of parameters is chosen by minimizing the error between the model response and the
real response. That is, the problem definition is:

$$\min_{a}\; [T_r - T_m(a, x)]^2$$ Equation (6)

$$\min_{b}\; [s_r - s_m(b, x)]^2$$ Equation (7)

where

$$T_m(a, x) = a_0 + a_1 x_1 + a_2 x_2 + a_{11} x_1^2 + a_{12} x_1 x_2$$

$$s_m(b, x) = b_0 + b_1 x_1 + b_2 x_2 + b_{11} x_1^2 + b_{12} x_1 x_2$$

The actual iterative equations for implementing the least squares algorithm of Equations (6) and
(7) are given below in Equations (8) and (9) as

$$\theta(t) = \theta(t-1) + \frac{P(t-2)\,\phi(t-1)}{1 + \phi(t-1)^T P(t-2)\,\phi(t-1)}\,\bigl[y(t) - \phi(t-1)^T \theta(t-1)\bigr]$$ Equation (8)

$$P(t-1) = P(t-2) - \frac{P(t-2)\,\phi(t-1)\,\phi(t-1)^T P(t-2)}{1 + \phi(t-1)^T P(t-2)\,\phi(t-1)}$$ Equation (9)

with given initial estimate \theta(0) and P(-1) = kP,

where k is a large constant and P is any positive definite matrix, typically the identity matrix I;
\theta(t) and \phi(t) have already been defined in Equation (3).
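A minimal sketch of one estimator update is given below, assuming that Equations (8) and (9) take the standard recursive least squares form with a symmetric covariance matrix. The function name, the plain-list matrix representation, and the straight-line test data are illustrative assumptions and are not taken from the specification.

```python
def rls_update(theta, P, phi, y):
    """One recursive least squares step.

    theta : current parameter estimates (list of n floats)
    P     : current covariance matrix (n x n list of lists, assumed symmetric)
    phi   : regressor vector for the run just measured
    y     : measured response for that run
    Returns the updated (theta, P).
    """
    n = len(theta)
    # P * phi; because P is symmetric this is also (phi^T P)^T.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    # Prediction error between the measured and the model response.
    error = y - sum(phi[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + P_phi[i] * error / denom for i in range(n)]
    new_P = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


# Illustration: recover y = 2 + 3x from noise-free data, starting from
# theta(0) = 0 and P(-1) = k * I with a large constant k.
k = 1e6
theta = [0.0, 0.0]
P = [[k, 0.0], [0.0, k]]
for x in [0.0, 1.0, 2.0, 3.0]:
    theta, P = rls_update(theta, P, [1.0, x], 2.0 + 3.0 * x)
```

A large k expresses low confidence in the initial parameter guess, so the first few measurements dominate the estimate.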

To verify that this procedure actually succeeds in tracking the response variables, the least
squares process was applied to a set of historical data and the result is presented in FIGS. 6a and
6b. From the graphs it can be seen that the model developed tracks the response variables very
well. However, since the response variables (thickness and standard deviation) are available only after
the fact, we expect that the estimator results will lag behind the current process by one run, and this
expectation is confirmed in the plots. This again affirms the need to incorporate
a method of prediction so that the estimator model output predictions synchronize with
the system outputs.

So, after two wafers are processed and the first estimator parameters are updated
sequentially after each data point is available using Equations (8) and (9), we obtain the model
equations T_m2 and s_m2 as given by Equations (6) and (7). Then, the first two data points are used as
starting points in doing polynomial extrapolation to predict the value of the outputs for the third
run; thereafter, the second estimator is engaged.

Let the extrapolated mean thickness and standard deviation for the third run be T_e3 and s_e3
respectively. Then, using the parameters a, b estimated by the previous estimator as the initial
parameter guess for the second estimator, the optimization problem becomes:

$$\min_{a}\; [T_{e3} - T_{m2}(a)]^2$$ Equation (10)

$$\min_{b}\; [s_{e3} - s_{m2}(b)]^2$$ Equation (11)

These new parameters a, b pertain to the state of the process one step ahead, assuming
that the same recipe x is used. From these new parameter values, we can compute the new
predicted thickness T_p3 and standard deviation s_p3 using Equation (1). Thus, if the process is
drifting we could still predict the mean thickness and standard deviation one step ahead, given that
we maintain the recipe at the previous run value. If our desired thickness and standard deviation
targets are respectively T_d and s_d, then we can back compute to find the recipe that ought to be used
in the next run to prevent the process from drifting. Hence, the problem becomes that of choosing
a recipe x such that the targets T_d and s_d are simultaneously met. Thus, we want

$$T_{p3}(x) = T_d \;\Rightarrow\; T_{p3}(x) - T_d = 0$$ Equation (12)

$$s_{p3}(x) = s_d \;\Rightarrow\; s_{p3}(x) - s_d = 0$$ Equation (13)

Thus, the recipe to use on the next run to get the mean thickness and standard deviation to target
is found by simultaneously solving the two nonlinear Equations (12) and (13). This procedure is
repeated until the system converges to the targeted values.
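The specification does not say which numerical method is used to solve Equations (12) and (13). One possible approach, sketched here purely as an assumption, is Newton's method with a finite-difference Jacobian for a two-variable recipe. The solver name, the tolerances, and the two stand-in model functions are hypothetical.

```python
def solve_recipe(T_model, s_model, T_d, s_d, x0, tol=1e-9, max_iter=50, h=1e-6):
    """Find (x1, x2) with T_model(x1, x2) = T_d and s_model(x1, x2) = s_d.

    Uses Newton's method; the 2 x 2 Jacobian is approximated by central
    differences and the linear system is solved by Cramer's rule.
    """
    x1, x2 = x0
    for _ in range(max_iter):
        f1 = T_model(x1, x2) - T_d
        f2 = s_model(x1, x2) - s_d
        if abs(f1) < tol and abs(f2) < tol:
            return x1, x2
        j11 = (T_model(x1 + h, x2) - T_model(x1 - h, x2)) / (2 * h)
        j12 = (T_model(x1, x2 + h) - T_model(x1, x2 - h)) / (2 * h)
        j21 = (s_model(x1 + h, x2) - s_model(x1 - h, x2)) / (2 * h)
        j22 = (s_model(x1, x2 + h) - s_model(x1, x2 - h)) / (2 * h)
        det = j11 * j22 - j12 * j21
        if abs(det) < 1e-14:
            raise ArithmeticError("Jacobian is singular at the current recipe")
        # Solve J * dx = f for the Newton step, then move against it.
        dx1 = (f1 * j22 - j12 * f2) / det
        dx2 = (j11 * f2 - j21 * f1) / det
        x1 -= dx1
        x2 -= dx2
    raise ArithmeticError("recipe search did not converge")


# Hypothetical stand-ins for the predicted thickness and standard deviation.
def T_pred(x1, x2):
    return 900.0 + 40.0 * x1 + 10.0 * x2 + 2.0 * x1 * x2


def s_pred(x1, x2):
    return 5.0 - x1 + 0.5 * x2 * x2


recipe = solve_recipe(T_pred, s_pred, 994.0, 3.5, (1.5, 0.5))
```

Because nonlinear equations can have several roots, starting the search from the current recipe, as done here, tends to return the nearby solution rather than a distant one.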
A summary of the series of steps for the implementation of the above adaptive control
system is shown in FIG. 2 and outlined below:

1. Initialize the adaptive controller by choosing appropriate initial parameter values. Using
these parameters and the nominal recipe values x, compute the model predicted
thickness T_m1 and sample standard deviation s_m1 using Equation (1).

2. Process the first wafer and compute the real mean thickness T_r1 and sample standard
deviation s_r1.

3. Compute the resulting error between the model prediction in 1 and the real process
results in 2. Using this error, in conjunction with the least squares process, update the
adaptive controller parameters as shown in Equations (6) and (7).

4. Process the next wafer and compute T_r2 and s_r2.

5. Update the model parameters again using the last processed wafer.

6. Using available real process data, do polynomial extrapolation to determine the predicted
thickness T_e3 and predicted standard deviation s_e3.

7. Update the adaptive controller parameters and compute the new model predicted
thickness T_p2 and standard deviation s_p2. This will be the state of the process in the next
run if the previous recipe is used.

8. Determine the optimum recipe x to use to drive the process to the desired targets T_d
and s_d by simultaneously solving the two Equations (12) and (13).

9. Using the recipe obtained in 8 as the current recipe, go to 4 and loop until a stopping
condition is met or until the process is stable enough to allow disengagement of the
adaptive controller, if necessary.

The results obtained by applying this procedure to our test process are summarized in FIGS.
7a and 7b. It can be seen that even though the controller was initialized with parameters that
resulted in large initial errors, the controller still converged to the optimum point in a few steps.
This fact was confirmed by running a number of experiments, and all the results were in close
agreement. This shows that the operation of the adaptive controller is not influenced very much by
the initial choice of the parameters and that it tends to drive the system towards the optimum
operating point. Indeed, for the least squares estimation process, parameters converge after n
runs, where n is the number of parameters to be estimated. In this experiment, the calculation of
the rate of convergence is complicated by the fact that we are dealing with two adaptive controllers
running in parallel: one for the thickness and the other for the standard deviation. Moreover, the
update of the parameters is done twice in a loop. In spite of this complication, we can see that the
controllers are well bounded and converge. It is interesting to note that as more weight is assigned
to the thickness performance measure, the adaptive controller for thickness converges to the target
thickness of 1000 nm. The weight placed on standard deviation is decreased accordingly, however,
so it was difficult to hold the standard deviation at 1.5 nm. After different weights were assigned
to the thickness and standard deviation criteria, it became clear that the tool was incapable of
meeting a more stringent process requirement. At run number 8 (wafer no. 8), the thickness obtained
was approximately 1000 nm with a standard deviation of about 2.3 nm. After run 9 (wafer no. 9), the
system had settled to the final steady-state value and the standard deviation had started leveling
out. It was not possible to get more than ten runs in an uninterrupted sequence with the setup
available for the illustrated example.
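
The statement that least squares parameters converge after n runs can be illustrated with a minimal sketch. The Python code below is not the estimator of this disclosure; it is a generic recursive least squares update applied to a hypothetical three-parameter quadratic model, with assumed coefficients, assumed input settings, noise-free measurements, and a large assumed initial covariance. Under those assumptions the estimates are still far from the true values after two runs and agree with them closely after the third.

```python
def rls_update(theta, P, x, y):
    """One recursive least squares step for the model y = x . theta."""
    n = len(theta)
    # Px = P @ x, written out with plain lists.
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    error = y - sum(x[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # P <- P - gain * (x^T P); P is symmetric, so x^T P equals Px.
    P = [[P[i][j] - gain[i] * Px[j] for j in range(n)] for i in range(n)]
    return theta, P


def run_demo():
    """Estimate a hypothetical quadratic model from three noise-free runs."""
    true_theta = [1000.0, -5.0, 0.5]   # assumed values, for illustration only
    n = len(true_theta)
    theta = [0.0] * n                  # deliberately poor initial guess
    P = [[1e8 if i == j else 0.0 for j in range(n)] for i in range(n)]
    history = []                       # worst parameter error after each run
    for u in (1.0, 2.0, 3.0):          # assumed input settings, one per run
        x = [1.0, u, u * u]
        y = sum(a * b for a, b in zip(x, true_theta))
        theta, P = rls_update(theta, P, x, y)
        history.append(max(abs(t - s) for t, s in zip(theta, true_theta)))
    return theta, history


theta, history = run_demo()
for run, err in enumerate(history, start=1):
    print(f"run {run}: largest parameter error = {err:.6f}")
```

With measurement noise, or with input settings that are not linearly independent, the estimates only approach the true values, so more than n runs would be expected in practice.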

From the models developed, the sensitivity of the process to changes in the input variables
was studied and the key variables were identified. From the sensitivity studies it was clear that a
tool that is to deliver products with tight tolerances requires that at least some, if not all, of
the key process variables have good resolutions and tolerances. This information may in turn
serve as a good input for specifying the tolerances and resolutions of the components and devices
used in building equipment. For example, the speed resolution and regulation of the electric motors
used in building electrical equipment, and the resolution and accuracy of the sensors used in
designing electrical tools, can be specified.
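
One way to read the sensitivity study is sketched below in Python. It assumes a two-input quadratic response surface in coded units, with coefficient values that are purely hypothetical and are not taken from the models of this disclosure. The sketch computes the partial derivatives of the response at an operating point and converts an allowed output deviation into a first-order estimate of the input resolution needed to stay within it.

```python
# Hypothetical coefficients of y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
EXAMPLE_COEFFS = {"b0": 1000.0, "b1": -40.0, "b2": 5.0,
                  "b11": 3.0, "b22": -1.0, "b12": 2.0}


def quadratic_response(b, x1, x2):
    """Evaluate the quadratic response surface at (x1, x2)."""
    return (b["b0"] + b["b1"] * x1 + b["b2"] * x2
            + b["b11"] * x1 ** 2 + b["b22"] * x2 ** 2 + b["b12"] * x1 * x2)


def sensitivities(b, x1, x2):
    """Return (dy/dx1, dy/dx2) of the quadratic surface at (x1, x2)."""
    d1 = b["b1"] + 2.0 * b["b11"] * x1 + b["b12"] * x2
    d2 = b["b2"] + 2.0 * b["b22"] * x2 + b["b12"] * x1
    return d1, d2


def required_resolution(output_tolerance, sensitivity):
    """First-order input step that keeps the output change within tolerance."""
    if sensitivity == 0.0:
        return float("inf")  # the output is locally insensitive to this input
    return output_tolerance / abs(sensitivity)


d1, d2 = sensitivities(EXAMPLE_COEFFS, 0.5, -0.5)
print("sensitivities:", d1, d2)
print("x1 resolution for a 1.9-unit output tolerance:", required_resolution(1.9, d1))
```

The estimate is linear, so it is only meaningful for small input steps around the chosen operating point; an input with a large sensitivity is a candidate key variable in the sense used above.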

The functional elements of the process and system of FIGS. 1 and 2 may be discrete
components or modules of a software program run on a known computer. Alternatively, they may
be discrete electrical or electronic components capable of performing the functions described
herein. It is believed that one of ordinary skill in the art having the above disclosure before him
could produce these components without undue experimentation.
INDUSTRIAL APPLICABILITY

This invention provides an automated means of optimizing a process and minimizing the
run-by-run variability. A novel adaptive controller is used to estimate the characteristics of a
process. Polynomial extrapolation is used to predict future outputs and, in conjunction with a
second adaptive estimator, to obtain a model representing the drifting process. Based on
this model, the recipe needed to cancel out the drifting trend can then be computed and
applied to prevent the process from drifting.
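
The prediction step can be sketched as follows. This Python example is an illustration under stated assumptions, not the equations of this disclosure: it assumes equally spaced runs, fits a polynomial of a chosen order exactly through the most recent measurements using Lagrange weights, and reports the predicted deviation from a target that a recipe correction would need to cancel. The function names, the measurement history, and the target value are hypothetical.

```python
def extrapolate_next(values, order=2):
    """Predict the next equally spaced sample from the last order + 1 values."""
    k = order + 1
    if len(values) < k:
        raise ValueError("need at least order + 1 samples")
    recent = values[-k:]
    x_next = k                      # samples sit at x = 0 .. k-1, predict x = k
    prediction = 0.0
    for i, yi in enumerate(recent):
        weight = 1.0                # Lagrange basis polynomial evaluated at x_next
        for j in range(k):
            if j != i:
                weight *= (x_next - j) / (i - j)
        prediction += weight * yi
    return prediction


def predicted_drift(values, target, order=2):
    """Deviation from target expected on the next run if nothing is changed."""
    return extrapolate_next(values, order) - target


thickness_history = [1000.0, 1002.0, 1006.0]   # hypothetical measurements, nm
print("predicted next value:", extrapolate_next(thickness_history))
print("predicted drift from a 1000 nm target:", predicted_drift(thickness_history, 1000.0))
```

Exact interpolation follows measurement noise as well as the trend, so a low order or a fit over a longer history may be preferable when the data are noisy.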

While the invention has been particularly shown and described with reference to a preferred 
embodiment thereof, it will be understood by those skilled in the art that various other changes in 
the form and details may be made therein without departing from the spirit and scope of the 
invention. 

What is claimed is: 
CLAIMS

1. An adaptive control system for controlling a process described by physical equations or empirical
models, said process having input variables and having response variables to be controlled, said
adaptive control system comprising:

(a) a multiplicity of adaptive controller means, equal in number to the number of response
variables to be controlled, said adaptive controller means operating in parallel and
concurrently;

(b) a multiplicity of estimators for estimating any time-varying components of said response
variables of process, including one estimator for each response variable;

(c) means for predicting estimated future values for said response variables of said process; and

(d) means for varying said input variables to control said future values for said response
variables to minimize said time-varying components in accordance with a desired result of
said process.

2. The adaptive control system of claim 1 wherein said means of predicting future values for said
response variables of said process comprises polynomial extrapolators. 

3. The adaptive control system of claim 1 wherein said means for predicting future values for said 
response variables of said process comprises polynomial extrapolation of recorded historical data 
characterizing each of said response variables. 

4. The adaptive control system of claim 1 wherein said means for varying said input variables to
control said future values for said response variables to minimize said time-varying components 
comprises means of solving n nonlinear simultaneous equations, where n is the number of said 
response variables to be controlled. 

5. A method for adaptively controlling a process operating on a sequence of samples according to a
recipe having recipe values, said method comprising the steps of:

(a) initializing an adaptive controller by setting initial parameter values and setting nominal
recipe values;

(b) computing a model-predicted parameter value using equation (1);

(c) processing a first sample of said sequence, measuring a first parameter of said first sample
multiple times to obtain a mean value and sample standard deviation for said first sample;

(d) computing the resulting error of said first parameter and updating the parameters of said
first sample by using equations (6) and (7);

(e) extrapolating to find the value of the next sample point and using it, as if it is the true
sample, to update adaptive controller parameters by using equations (10) and (11);

(f) computing the optimum recipe to use on the next run by using equations (12) and
(13); and

(g) repeating the above steps (b) through (f) until a stopping condition is met.

6. The method for adaptively controlling a process of claim 5, wherein said
stopping condition includes a minimum value for time dependence of at least one of said
parameter values.

7. An automatic control system having adaptive controllers to track and control processes subject
to time-dependent input parameters, characterized in that:

said automatic control system has means of controlling systems and processes using polynomial
approximations, and has parameter estimators equal in number to the number of response
variables to be controlled.

8. An automatic control system for controlling a process described by physical equations or
empirical models, said process having input variables and having response variables to be
controlled, said automatic control system comprising a computer of known type, programmed
with instructions to perform the steps of:

(a) initializing an adaptive controller by setting initial parameter values and setting
nominal recipe values;

(b) computing a model-predicted parameter value using equation (1);

(c) processing a first sample of said sequence, measuring a first parameter of said first
sample multiple times to obtain a mean value and sample standard deviation for said
first parameter of said first sample;

(d) computing the resulting error of said first parameter and updating the parameters of
said first sample using equations (6) and (7);

(e) extrapolating to find the value of the next sample point and using it, as if it is the
true sample, to update adaptive controller parameters using equations (10) and
(11);

(f) computing optimum recipe values to use on the next run using equations (12) and
(13) and substituting said optimum values into equation (1); and

(g) repeating the above steps (b) through (f) until a stopping condition is met.

9. The automatic control system of claim 8, wherein said stopping condition includes a minimum
value for time dependence of at least one of said parameter values.






[Drawing sheet 3/11: plots of "Mean Thickness vs. Run No." and "Standard Dev vs. Run No."; horizontal axis: Run No.]







[Drawing sheet 4/11: THICKNESS 3D PLOT]

FIG. 4b